Approximations and Bounds for (n, k) Fork-Join Queues: A Linear Transformation Approach
Compared to basic fork-join queues, a job in an (n, k) fork-join queue only
needs k out of its n sub-tasks to finish. Since (n, k) fork-join queues are
prevalent in popular distributed systems, erasure-coding-based cloud storage,
and modern network protocols such as multipath routing, estimating the sojourn
time of such queues is critical for performance measurement and resource
planning in computer clusters. However, this estimation has remained a
well-known open challenge for years, and only rough bounds for a limited range
of load factors have been given. In this paper, we develop a closed-form
linear transformation technique for jointly identical random variables: an
order statistic can be represented by a linear combination of maxima. This
new technique is then used to transform the sojourn time of non-purging
(n, k) fork-join queues into a linear combination of the sojourn times of basic
(k, k), (k+1, k+1), ..., (n, n) fork-join queues. Consequently, existing
approximations for basic fork-join queues can be carried over to approximations
for non-purging (n, k) fork-join queues. The resulting approximations are then
used to improve the upper bounds for purging (n, k) fork-join queues.
Simulation experiments show that this linear transformation approach works
well for moderate n and relatively large k.
Comment: 10 page
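The central identity, an order statistic written as a linear combination of maxima, can be sketched concretely. The sketch below assumes i.i.d. Exp(1) sub-task times (a special case of jointly identical variables) so the check can be done in closed form via harmonic numbers; the function names are illustrative, not the paper's code:

```python
from math import comb

def order_stat_coeffs(n, k):
    """Coefficients c_j such that, for jointly identical (exchangeable)
    random variables, E[X_(k:n)] = sum_j c_j * E[max(X_1, ..., X_j)].
    They follow from the inclusion-exclusion identity for
    P(at least k of n events): sum_{j=k..n} (-1)^(j-k) C(j-1,k-1) S_j."""
    return {j: (-1) ** (j - k) * comb(j - 1, k - 1) * comb(n, j)
            for j in range(k, n + 1)}

def harmonic(m):
    return sum(1.0 / i for i in range(1, m + 1))

# Sanity check with i.i.d. Exp(1): E[max of j] = H_j, and the k-th
# smallest of n has mean H_n - H_{n-k} (both are textbook facts).
n, k = 4, 2
coeffs = order_stat_coeffs(n, k)            # {2: 6, 3: -8, 4: 3}
lhs = harmonic(n) - harmonic(n - k)         # exact E[2nd of 4] = 7/12
rhs = sum(c * harmonic(j) for j, c in coeffs.items())
print(round(lhs, 10) == round(rhs, 10))     # True
```

The coefficients only use the symmetry of the joint distribution, so the same linear combination applies beyond the i.i.d. case used for this sanity check.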
Reinforced Imitative Graph Representation Learning for Mobile User Profiling: An Adversarial Training Perspective
In this paper, we study the problem of mobile user profiling, a critical
component for quantifying users' characteristics in the human mobility
modeling pipeline. Human mobility is a sequential decision-making process that
depends on users' dynamic interests. With accurate user profiles, the
predictive model can perfectly reproduce users' mobility trajectories. In the
reverse direction, once the predictive model can imitate users' mobility
patterns, the learned user profiles are also optimal. This intuition motivates
us to propose an imitation-based mobile user profiling framework that exploits
reinforcement learning, in which the agent is trained to precisely imitate
users' mobility patterns so as to obtain optimal user profiles. Specifically,
the proposed framework includes two modules: (1) a representation module, which
produces a state combining user profiles and spatio-temporal context in real
time; (2) an imitation module, in which a Deep Q-Network (DQN) imitates the
user behavior (action) based on the state produced by the representation
module. However, there are two challenges in running the framework
effectively. First, the epsilon-greedy strategy in DQN handles the
exploration-exploitation trade-off by randomly picking actions with
probability epsilon. This randomness feeds back into the representation
module, making the learned user profiles unstable. To solve this problem, we
propose an adversarial training strategy to guarantee the robustness of the
representation module. Second, the representation module updates users'
profiles incrementally, which requires integrating the temporal effects of
user profiles. Inspired by Long Short-Term Memory (LSTM), we introduce a gated
mechanism to incorporate new and old user characteristics into the user
profile.
Comment: AAAI 202
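The LSTM-inspired gated update can be sketched as a per-dimension convex combination of old and new user characteristics. This is an illustrative parameterization under assumed shapes, not the paper's exact gate:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_profile_update(old_profile, new_evidence, W, b):
    """Illustrative LSTM-style gate: g decides, per dimension, how much
    of the old profile to keep versus the new evidence. W and b are
    assumed learned parameters (hypothetical names)."""
    z = np.concatenate([old_profile, new_evidence])
    g = sigmoid(W @ z + b)                   # gate values in (0, 1)
    return g * old_profile + (1.0 - g) * new_evidence

d = 8
old = rng.normal(size=d)
new = rng.normal(size=d)
W = rng.normal(scale=0.1, size=(d, 2 * d))
b = np.zeros(d)
updated = gated_profile_update(old, new, W, b)
# a convex combination stays between the old and new values, per dimension
print(np.all((updated >= np.minimum(old, new)) & (updated <= np.maximum(old, new))))
```

Because the gate is learned, the update can retain long-standing characteristics in some dimensions while quickly absorbing new behavior in others.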
Reinforcement-Enhanced Autoregressive Feature Transformation: Gradient-steered Search in Continuous Space for Postfix Expressions
Feature transformation aims to generate a new, pattern-discriminative feature
space from the original features in order to improve downstream machine
learning (ML) task performance. However, the discrete search space for the
optimal feature set grows explosively with the combinations of features and
operations, from low-order forms to high-order forms. Existing methods, such
as exhaustive search, expansion-reduction, evolutionary algorithms,
reinforcement learning, and iterative greedy search, suffer from this large
search space, and overly emphasizing efficiency in algorithm design usually
sacrifices stability or robustness. To fundamentally fill this gap, we
reformulate discrete feature transformation as a continuous-space optimization
task and develop an embedding-optimization-reconstruction framework. This
framework includes four steps: 1) reinforcement-enhanced data preparation,
which prepares high-quality transformation-accuracy training data; 2) feature
transformation operation sequence embedding, which encapsulates the knowledge
of the prepared training data within a continuous space; 3) gradient-steered
optimal embedding search, which uncovers potentially superior embeddings
within the learned space; 4) transformation operation sequence reconstruction,
which reproduces the feature transformation solution to pinpoint the optimal
feature space.
Comment: Accepted by NeurIPS 202
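Step 3, the gradient-steered embedding search, can be sketched with a toy performance predictor whose gradient is known in closed form; `grad_fn` stands in for backpropagation through a learned predictor, and all names here are hypothetical:

```python
import numpy as np

def gradient_steered_search(e0, grad_fn, lr=0.1, steps=200):
    """Gradient ascent in a continuous embedding space: repeatedly move
    the embedding in the direction of higher predicted performance.
    grad_fn is a stand-in for the gradient of a learned predictor."""
    e = e0.copy()
    for _ in range(steps):
        e = e + lr * grad_fn(e)
    return e

# Toy predictor: predicted accuracy peaks at a known embedding e_star.
e_star = np.array([1.0, -2.0, 0.5])
predict = lambda e: -np.sum((e - e_star) ** 2)
grad = lambda e: -2.0 * (e - e_star)

e0 = np.zeros(3)
e_opt = gradient_steered_search(e0, grad)
print(predict(e_opt) > predict(e0))          # the search found a better embedding
```

In the actual framework the improved embedding would then be decoded back into an operation sequence (step 4); the decoder is omitted here.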
Dish-TS: A General Paradigm for Alleviating Distribution Shift in Time Series Forecasting
The distribution shift in Time Series Forecasting (TSF), i.e., series
distributions changing over time, largely hinders the performance of TSF
models. Existing works on distribution shift in time series are mostly limited
to quantifying the distribution and, more importantly, overlook the potential
shift between lookback and horizon windows. To address these challenges, we
systematically summarize the distribution shift in TSF into two categories.
Regarding lookback windows as the input space and horizon windows as the
output space, there exist (i) intra-space shift, where the distribution within
the input space keeps shifting over time, and (ii) inter-space shift, where
the distribution shifts between the input space and the output space. We then
introduce Dish-TS, a general neural paradigm for alleviating distribution
shift in TSF. Specifically, for better distribution estimation, we propose the
coefficient net (CONET), which can be any neural architecture, to map input
sequences into learnable distribution coefficients. To relieve intra-space and
inter-space shift, we organize Dish-TS as a Dual-CONET framework that
separately learns the distributions of the input and output spaces, which
naturally captures the distribution difference between the two spaces. In
addition, we introduce a more effective training strategy for the intractable
CONET learning. Finally, we conduct extensive experiments on several datasets
coupled with different state-of-the-art forecasting models. Experimental
results show that Dish-TS consistently boosts these models by more than 20%
on average. Code is available.
Comment: Accepted by AAAI 202
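The Dual-CONET idea, normalizing the lookback with input-space coefficients and denormalizing the forecast with separately estimated output-space coefficients, can be sketched as follows. The mean/std coefficient net and the fixed output-space coefficients are simple stand-ins for the learned CONETs:

```python
import numpy as np

class MeanStdConet:
    """A minimal coefficient 'net': maps a window to a (shift, scale)
    pair. Real CONETs are learnable networks; this stand-in just uses
    the window's mean and standard deviation (an assumption)."""
    def __call__(self, window):
        return window.mean(), window.std() + 1e-8

def dish_ts_step(lookback, forecast_fn, conet_in, out_coeffs):
    """Dual-CONET sketch: normalize the lookback with input-space
    coefficients, forecast in normalized space, then denormalize with
    separately estimated output-space coefficients."""
    mu_in, sig_in = conet_in(lookback)
    normed = (lookback - mu_in) / sig_in
    pred_normed = forecast_fn(normed)
    mu_out, sig_out = out_coeffs             # in Dish-TS these are predicted
    return pred_normed * sig_out + mu_out

# Toy series whose horizon has a shifted mean relative to the lookback.
lookback = np.arange(24, dtype=float)        # lookback mean 11.5
horizon_coeffs = (30.0, 1.0)                 # assumed output-space estimate
naive_forecast = lambda x: np.full(8, x[-4:].mean())
pred = dish_ts_step(lookback, naive_forecast, MeanStdConet(), horizon_coeffs)
print(pred.shape)                            # (8,)
```

The point of the two coefficient sets is visible even in this toy: a single normalize/denormalize pair would pull the forecast back toward the lookback mean, while the output-space coefficients let it land at the shifted horizon level.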
Semi-supervised Domain Adaptation in Graph Transfer Learning
As a specific case of graph transfer learning, unsupervised domain adaptation
on graphs aims to transfer knowledge from label-rich source graphs to
unlabeled target graphs. However, graphs with topology and attributes usually
exhibit considerable cross-domain disparity, and there are numerous real-world
scenarios in which only a subset of nodes is labeled in the source graph. This
imposes critical challenges on graph transfer learning due to serious domain
shifts and label scarcity. To address these challenges, we propose a method
named Semi-supervised Graph Domain Adaptation (SGDA). To deal with the domain
shift, we add adaptive shift parameters to each of the source nodes, which are
trained in an adversarial manner to align the cross-domain distributions of
node embeddings, so that the node classifier trained on labeled source nodes
can be transferred to the target nodes. Moreover, to address the label
scarcity, we propose pseudo-labeling on unlabeled nodes, which improves
classification on the target graph by measuring the posterior influence of
nodes based on their relative positions to the class centroids. Finally,
extensive experiments on a range of publicly accessible datasets validate the
effectiveness of our proposed SGDA in different experimental settings.
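The centroid-based pseudo-labeling step can be sketched as follows; the margin-based confidence rule is an illustrative proxy for the paper's posterior-influence measure, not its exact criterion:

```python
import numpy as np

def centroid_pseudo_labels(emb_labeled, y_labeled, emb_unlabeled, margin=0.2):
    """Sketch: assign each unlabeled node the class of its nearest class
    centroid, and keep the pseudo-label only when the node is clearly
    closer to that centroid than to the runner-up."""
    classes = np.unique(y_labeled)
    centroids = np.stack([emb_labeled[y_labeled == c].mean(axis=0)
                          for c in classes])
    dists = np.linalg.norm(emb_unlabeled[:, None, :] - centroids[None, :, :],
                           axis=-1)
    order = np.argsort(dists, axis=1)
    idx = np.arange(len(emb_unlabeled))
    gap = dists[idx, order[:, 1]] - dists[idx, order[:, 0]]
    return classes[order[:, 0]], gap > margin

# Two labeled clusters; one clearly placed and one ambiguous unlabeled node.
emb_l = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]])
y_l = np.array([0, 0, 1, 1])
emb_u = np.array([[0.1, 0.1], [2.7, 2.5]])
labels, keep = centroid_pseudo_labels(emb_l, y_l, emb_u)
print(labels.tolist(), keep.tolist())   # only the clearly placed node is kept
```

Nodes that sit between centroids are excluded from pseudo-label training, which is the usual way to keep noisy labels from amplifying the domain shift.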
Traceable Group-Wise Self-Optimizing Feature Transformation Learning: A Dual Optimization Perspective
Feature transformation aims to reconstruct an effective representation space
by mathematically refining the existing features. It serves as a pivotal
approach to combat the curse of dimensionality, enhance model generalization,
mitigate data sparsity, and extend the applicability of classical models.
Existing research predominantly focuses on domain knowledge-based feature
engineering or learning latent representations. However, these methods, while
insightful, lack full automation and fail to yield a traceable and optimal
representation space. A natural question arises: can we concurrently
address these limitations when reconstructing a feature space for a
machine-learning task? Our initial work took a pioneering step towards this
challenge by introducing a novel self-optimizing framework. This framework
leverages the power of three cascading reinforced agents to automatically
select candidate features and operations for generating improved feature
transformation combinations. Despite the impressive strides made, there was
room for enhancing its effectiveness and generalization capability. In this
extended journal version, we advance our initial work from two distinct yet
interconnected perspectives: 1) We propose a refinement of the original
framework, which integrates a graph-based state representation method to
capture the feature interactions more effectively and develop different
Q-learning strategies to alleviate Q-value overestimation further. 2) We
utilize a new optimization technique (actor-critic) to train the entire
self-optimizing framework in order to accelerate the model convergence and
improve the feature transformation performance. Finally, to validate the
improved effectiveness and generalization capability of our framework, we
perform extensive experiments and conduct comprehensive analyses.
Comment: 21 pages, submitted to TKDD. arXiv admin note: text overlap with arXiv:2209.08044, arXiv:2205.1452
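The Q-value overestimation mentioned above comes from taking a max over noisy estimates. Decoupling action selection from action evaluation, as in double Q-learning, removes most of the upward bias; whether this matches the paper's exact strategy is an assumption, but the mechanism can be checked with a toy experiment where every true action value is zero:

```python
import numpy as np

rng = np.random.default_rng(1)

def vanilla_target(q, r, gamma):
    # standard target: max over the same noisy estimates, which
    # systematically overestimates because E[max] >= max of E
    return r + gamma * q.max()

def double_q_target(q_select, q_eval, r, gamma):
    # double Q-learning: one estimator picks the action, the other
    # scores it, decoupling selection from evaluation
    a = int(np.argmax(q_select))
    return r + gamma * q_eval[a]

# All true action values are 0; the estimates are pure noise, so any
# positive average target is overestimation bias.
n_actions, trials, gamma = 10, 20000, 0.9
bias_vanilla = np.mean([vanilla_target(rng.normal(size=n_actions), 0.0, gamma)
                        for _ in range(trials)])
bias_double = np.mean([double_q_target(rng.normal(size=n_actions),
                                       rng.normal(size=n_actions), 0.0, gamma)
                       for _ in range(trials)])
print(bias_vanilla > bias_double)            # vanilla overestimates more
```

With ten actions, the vanilla target's bias is roughly gamma times the expected maximum of ten standard normals (about 1.4 here), while the decoupled target is unbiased on average.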
Self-Optimizing Feature Transformation
Feature transformation aims to extract a good representation (feature) space
by mathematically transforming existing features. It is crucial for addressing
the curse of dimensionality, enhancing model generalization, overcoming data
sparsity, and extending the applicability of classic models. Current research
focuses on domain knowledge-based feature engineering or learning latent
representations; nevertheless, these methods are not fully automated and
cannot produce a traceable and optimal representation space. When rebuilding a
feature space for a machine learning task, can these limitations be addressed
concurrently? In this extension study, we present a self-optimizing framework
for feature transformation. To achieve better performance, we improve on the
preliminary work by (1) obtaining an advanced state representation that
enables the reinforced agents to better comprehend the current feature set;
and (2) resolving Q-value overestimation in the reinforced agents so that they
learn unbiased and effective policies. Finally, to make the experiments more
convincing than in the preliminary work, we add an outlier detection task with
five datasets, evaluate various state representation approaches, and compare
different training strategies. Extensive experiments and case studies show
that our approach is more effective than, and superior to, the preliminary
work.
Comment: Under review of TKDE. arXiv admin note: substantial text overlap with arXiv:2205.1452
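One simple way to obtain the kind of fixed-size state representation described above is to summarize the current feature set with column-wise statistics, so the state dimension does not depend on how many features the agents have generated. This is an illustrative stand-in, not the paper's exact construction:

```python
import numpy as np

def feature_set_state(X):
    """Illustrative state for a variable-width feature set: describe
    each column by four statistics, then aggregate across columns so
    the state has a fixed length regardless of feature count."""
    col_stats = np.stack([X.mean(axis=0), X.std(axis=0),
                          X.min(axis=0), X.max(axis=0)])  # (4, n_features)
    # aggregate across columns -> fixed-length state vector
    return np.concatenate([col_stats.mean(axis=1), col_stats.std(axis=1)])

rng = np.random.default_rng(0)
s1 = feature_set_state(rng.normal(size=(100, 5)))
s2 = feature_set_state(rng.normal(size=(100, 9)))  # more features, same size
print(s1.shape == s2.shape == (8,))                 # True
```

A fixed-size state is what lets a single Q-network or policy operate across all steps of the transformation process, even as the feature set grows.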